Belief Flows of Robust Online Learning
نویسندگان
چکیده
This paper introduces a new probabilistic model for online learning which dynamically incorporates information from stochastic gradients of an arbitrary loss function. Similar to probabilistic filtering, the model maintains a Gaussian belief over the optimal weight parameters. Unlike traditional Bayesian updates, the model incorporates a small number of gradient evaluations at locations chosen using Thompson sampling, making it computationally tractable. The belief is then transformed via a linear flow field which optimally updates the belief distribution using rules derived from information theoretic principles. Several versions of the algorithm are shown using different constraints on the flow field and compared with conventional online learning algorithms. Results are given for several classification tasks including logistic regression and multilayer neural networks.
منابع مشابه
Online Belief Propagation for Topic Modeling
Not only can online topic modeling algorithms extract topics from big data streams with constant memory requirements, but also can detect topic shifts as the data stream flows. Fast convergence speed is a desired property for batch learning topic models such as latent Dirichlet allocation (LDA), which can further facilitate developing fast online topic modeling algorithms for big data streams. ...
متن کاملAn Online Q-learning Based Multi-Agent LFC for a Multi-Area Multi-Source Power System Including Distributed Energy Resources
This paper presents an online two-stage Q-learning based multi-agent (MA) controller for load frequency control (LFC) in an interconnected multi-area multi-source power system integrated with distributed energy resources (DERs). The proposed control strategy consists of two stages. The first stage is employed a PID controller which its parameters are designed using sine cosine optimization (SCO...
متن کاملPlanning Robust Strategies for Constructing Multi-object Arrangements
A crucial challenge in robotics is achieving reliable results in spite of sensing and control uncertainty. In this work, we explore the conformant planning approach to robot manipulation. In particular, we tackle the problem of pushing multiple objects simultaneously to achieve a specified arrangement without external sensing. Conformant planning is a belief-state planning problem. A belief sta...
متن کاملReinforcement Learning Based PID Control of Wind Energy Conversion Systems
In this paper an adaptive PID controller for Wind Energy Conversion Systems (WECS) has been developed. Theadaptation technique applied to this controller is based on Reinforcement Learning (RL) theory. Nonlinearcharacteristics of wind variations as plant input, wind turbine structure and generator operational behaviordemand for high quality adaptive controller to ensure both robust stability an...
متن کاملROBUST RESOURCE-CONSTRAINED PROJECT SCHEDULING WITH UNCERTAIN-BUT-BOUNDED ACTIVITY DURATIONS AND CASH FLOWS II. SOUNDS OF SILENCE: A NEW SAMPLING-BASED HYBRID PRIMARY-SECONDARY CRITERIA HARMONY SEARCH METAHEURISTIC
In this paper, we present a new idea for robust project scheduling combined with a cost-oriented uncertainty investigation. The result of the new approach is a makespan minimal robust proactive schedule, which is immune against the uncertainties in the activity durations and which can be evaluated from a cost-oriented point of view on the set of the uncertain-but-bounded duration and cost param...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1505.07067 شماره
صفحات -
تاریخ انتشار 2015